fix(commands): render templates online in apply to resolve lookups#119
fix(commands): render templates online in apply to resolve lookups#119
Conversation
Signed-off-by: Myasnikov Daniil <myasnikovdaniil2001@gmail.com>
📝 WalkthroughWalkthroughApply now renders templates using a live Talos client (so lookup()-dependent templates can resolve discovery data). A new in-file helper Changes
Sequence Diagram(s)sequenceDiagram
participant CLI as "CLI: apply"
participant Helper as "withApplyClient"
participant Client as "Talos Client"
participant Engine as "engine.Render"
participant API as "c.ApplyConfiguration"
participant Printer as "helpers.PrintApplyResults"
CLI->>Helper: start apply flow
Helper->>Client: create/select client (maintenance / node-target / insecure)
Client->>Engine: render templates (lookup() can query client)
Engine-->>Client: rendered configuration
Client->>API: ApplyConfiguration(rendered configuration)
API-->>Printer: apply response
Printer-->>CLI: output results
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request refactors the apply command to support online template rendering, enabling Helm lookup() functions to resolve real discovery data by connecting to the node before rendering. It introduces a withApplyClient helper to centralize client management across different execution paths. Review feedback identifies a critical bug in the direct patch path where the err variable is shadowed, causing errors to be silently ignored. Additionally, it is recommended to use %w for error wrapping to preserve context and allow for error inspection.
| err = withApplyClient(func(ctx context.Context, c *client.Client) error { | ||
| fmt.Printf("- talm: file=%s, nodes=%s, endpoints=%s\n", configFile, GlobalArgs.Nodes, GlobalArgs.Endpoints) | ||
|
|
||
| return WithClientNoNodes(wrappedF) | ||
| } | ||
| resp, err := c.ApplyConfiguration(ctx, &machineapi.ApplyConfigurationRequest{ | ||
| Data: result, | ||
| Mode: applyCmdFlags.Mode.Mode, | ||
| DryRun: applyCmdFlags.dryRun, | ||
| TryModeTimeout: durationpb.New(applyCmdFlags.configTryTimeout), | ||
| }) | ||
| if err != nil { | ||
| return fmt.Errorf("error applying new configuration: %s", err) | ||
| } | ||
|
|
||
| err = withClient(func(ctx context.Context, c *client.Client) error { | ||
| fmt.Printf("- talm: file=%s, nodes=%s, endpoints=%s\n", configFile, GlobalArgs.Nodes, GlobalArgs.Endpoints) | ||
| helpers.PrintApplyResults(resp) | ||
|
|
||
| resp, err := c.ApplyConfiguration(ctx, &machineapi.ApplyConfigurationRequest{ | ||
| Data: result, | ||
| Mode: applyCmdFlags.Mode.Mode, | ||
| DryRun: applyCmdFlags.dryRun, | ||
| TryModeTimeout: durationpb.New(applyCmdFlags.configTryTimeout), | ||
| return nil | ||
| }) |
There was a problem hiding this comment.
The err variable is shadowed in this else block (due to the use of := at lines 146 and 151), which means the assignment at line 156 does not update the err variable checked at line 174. Consequently, any error returned by withApplyClient in the direct patch path will be silently ignored, and the loop will incorrectly proceed to the next file. It is safer to check the error immediately within the block. Additionally, using %w for error wrapping is preferred over %s to maintain error context.
| err = withApplyClient(func(ctx context.Context, c *client.Client) error { | |
| fmt.Printf("- talm: file=%s, nodes=%s, endpoints=%s\n", configFile, GlobalArgs.Nodes, GlobalArgs.Endpoints) | |
| return WithClientNoNodes(wrappedF) | |
| } | |
| resp, err := c.ApplyConfiguration(ctx, &machineapi.ApplyConfigurationRequest{ | |
| Data: result, | |
| Mode: applyCmdFlags.Mode.Mode, | |
| DryRun: applyCmdFlags.dryRun, | |
| TryModeTimeout: durationpb.New(applyCmdFlags.configTryTimeout), | |
| }) | |
| if err != nil { | |
| return fmt.Errorf("error applying new configuration: %s", err) | |
| } | |
| err = withClient(func(ctx context.Context, c *client.Client) error { | |
| fmt.Printf("- talm: file=%s, nodes=%s, endpoints=%s\n", configFile, GlobalArgs.Nodes, GlobalArgs.Endpoints) | |
| helpers.PrintApplyResults(resp) | |
| resp, err := c.ApplyConfiguration(ctx, &machineapi.ApplyConfigurationRequest{ | |
| Data: result, | |
| Mode: applyCmdFlags.Mode.Mode, | |
| DryRun: applyCmdFlags.dryRun, | |
| TryModeTimeout: durationpb.New(applyCmdFlags.configTryTimeout), | |
| return nil | |
| }) | |
| if err := withApplyClient(func(ctx context.Context, c *client.Client) error { | |
| fmt.Printf("- talm: file=%s, nodes=%s, endpoints=%s\\n", configFile, GlobalArgs.Nodes, GlobalArgs.Endpoints) | |
| resp, err := c.ApplyConfiguration(ctx, &machineapi.ApplyConfigurationRequest{ | |
| Data: result, | |
| Mode: applyCmdFlags.Mode.Mode, | |
| DryRun: applyCmdFlags.dryRun, | |
| TryModeTimeout: durationpb.New(applyCmdFlags.configTryTimeout), | |
| }) | |
| if err != nil { | |
| return fmt.Errorf("error applying new configuration: %w", err) | |
| } | |
| helpers.PrintApplyResults(resp) | |
| return nil | |
| }); err != nil { | |
| return err | |
| } |
| if err != nil { | ||
| return fmt.Errorf("error applying new configuration: %s", err) | ||
| } |
There was a problem hiding this comment.
Use %w instead of %s to wrap the error. This is a best practice in Go as it allows the caller to use errors.Is or errors.As to inspect the underlying error.
| if err != nil { | |
| return fmt.Errorf("error applying new configuration: %s", err) | |
| } | |
| if err != nil { | |
| return fmt.Errorf("error applying new configuration: %w", err) | |
| } |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/commands/apply.go (1)
146-176:⚠️ Potential issue | 🔴 CriticalCritical: Variable shadowing causes errors to be silently ignored in the direct patch path.
Line 146 declares a new
errvariable via:=that shadows the outererrfrom line 108. Line 156's assignment toerrgoes to this shadowed variable. When the else block ends at line 173, the shadowederrgoes out of scope, and line 174 checks the outererr(which isnilfrom the successfulprocessModelineAndUpdateGlobalscall).This means errors from
withApplyClientin the direct patch path are silently ignored.🐛 Proposed fix to eliminate variable shadowing
Declare
configBundleandmachineTypebefore the assignment to avoid creating a newerr:} else { // Direct patch path: apply config file as patch against empty bundle opts := buildApplyPatchOptions(withSecretsPath) patches := []string{"@" + configFile} - configBundle, machineType, err := engine.FullConfigProcess(ctx, opts, patches) + var configBundle *v1alpha1.ConfigBundle + var machineType machine.Type + configBundle, machineType, err = engine.FullConfigProcess(ctx, opts, patches) if err != nil { return fmt.Errorf("full config processing error: %w", err) }You'll need to add the appropriate imports for the types (likely from
github.com/siderolabs/talos/pkg/machinery/config/bundleandgithub.com/siderolabs/talos/pkg/machinery/config/machine).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/commands/apply.go` around lines 146 - 176, The direct-patch path is shadowing err because configBundle, machineType are created with := in the block; predeclare configBundle and machineType (and err if needed) before calling engine.FullConfigProcess so that the assignment uses = instead of :=, ensuring the err assigned inside the block is the same outer err checked after the block; adjust the declaration around engine.FullConfigProcess and withApplyClient (referencing engine.FullConfigProcess, configBundle, machineType, err, withApplyClient, helpers.PrintApplyResults) and add the necessary imports for the bundle/machine types used for configBundle and machineType.
🧹 Nitpick comments (1)
pkg/commands/apply.go (1)
134-136: Consider using%wfor error wrapping to preserve the error chain.Using
%sformats the error as a string, losing the ability to useerrors.Is()orerrors.Unwrap()on the returned error. The same applies to line 166.♻️ Suggested fix
- return fmt.Errorf("error applying new configuration: %s", err) + return fmt.Errorf("error applying new configuration: %w", err)And similarly for line 166:
- return fmt.Errorf("error applying new configuration: %s", err) + return fmt.Errorf("error applying new configuration: %w", err)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/commands/apply.go` around lines 134 - 136, Replace the fmt.Errorf calls that format the error with "%s" so they wrap the original error using "%w" to preserve the error chain (allowing errors.Is / errors.Unwrap). Specifically, locate the fmt.Errorf call that returns fmt.Errorf("error applying new configuration: %s", err) and change it to use "%w" to wrap err, and do the same for the other occurrence mentioned (the fmt.Errorf at the later return around line 166); keep the descriptive message unchanged but use "%w" so the returned error preserves the original error value.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@pkg/commands/apply.go`:
- Around line 146-176: The direct-patch path is shadowing err because
configBundle, machineType are created with := in the block; predeclare
configBundle and machineType (and err if needed) before calling
engine.FullConfigProcess so that the assignment uses = instead of :=, ensuring
the err assigned inside the block is the same outer err checked after the block;
adjust the declaration around engine.FullConfigProcess and withApplyClient
(referencing engine.FullConfigProcess, configBundle, machineType, err,
withApplyClient, helpers.PrintApplyResults) and add the necessary imports for
the bundle/machine types used for configBundle and machineType.
---
Nitpick comments:
In `@pkg/commands/apply.go`:
- Around line 134-136: Replace the fmt.Errorf calls that format the error with
"%s" so they wrap the original error using "%w" to preserve the error chain
(allowing errors.Is / errors.Unwrap). Specifically, locate the fmt.Errorf call
that returns fmt.Errorf("error applying new configuration: %s", err) and change
it to use "%w" to wrap err, and do the same for the other occurrence mentioned
(the fmt.Errorf at the later return around line 166); keep the descriptive
message unchanged but use "%w" so the returned error preserves the original
error value.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a2d488da-3168-404d-8036-2e4a255a7165
📒 Files selected for processing (2)
pkg/commands/apply.gopkg/commands/apply_test.go
Signed-off-by: Myasnikov Daniil <myasnikovdaniil2001@gmail.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/commands/apply.go`:
- Line 121: The fmt.Printf call that prints slice values uses %s which
misformats []string; update the format specifier to use %v for GlobalArgs.Nodes
and GlobalArgs.Endpoints in the Printf on the line with fmt.Printf("- talm:
file=%s, nodes=%s, endpoints=%s\n", configFile, GlobalArgs.Nodes,
GlobalArgs.Endpoints) and likewise update the other occurrence (around the
second reported location) so both slices are formatted with %v instead of %s.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
| } | ||
|
|
||
| err = withApplyClient(func(ctx context.Context, c *client.Client) error { | ||
| fmt.Printf("- talm: file=%s, nodes=%s, endpoints=%s\n", configFile, GlobalArgs.Nodes, GlobalArgs.Endpoints) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's look at the code around lines 121 and 157 in pkg/commands/apply.go
head -160 pkg/commands/apply.go | tail -50Repository: cozystack/talm
Length of output: 2030
🏁 Script executed:
# Also check what GlobalArgs is and its type definition
rg "GlobalArgs\s*=" pkg/commands/ -A 2 -B 2 | head -30Repository: cozystack/talm
Length of output: 393
🏁 Script executed:
# Search for GlobalArgs struct/variable definition
rg "GlobalArgs" pkg/ -t go | grep -E "(var|const|type).*GlobalArgs" | head -20Repository: cozystack/talm
Length of output: 105
🏁 Script executed:
# Find the definition of global.Args struct
fd "args.go" pkg/ --type fRepository: cozystack/talm
Length of output: 40
🏁 Script executed:
# Search for Args struct in global package
rg "type Args struct" pkg/ -A 20Repository: cozystack/talm
Length of output: 40
🏁 Script executed:
# Find all Go files to locate the global package
find pkg -name "*.go" | xargs grep -l "package global"Repository: cozystack/talm
Length of output: 42
🏁 Script executed:
# Search for Nodes and Endpoints field declarations
rg "Nodes|Endpoints" pkg/ -t go -B 2 -A 2 | head -50Repository: cozystack/talm
Length of output: 2941
🏁 Script executed:
# List all Go files in pkg and look for global-related files
ls -la pkg/Repository: cozystack/talm
Length of output: 503
🏁 Script executed:
# Search more broadly for Args struct definition in entire codebase
rg "Nodes.*\[\]string|Endpoints.*\[\]string" pkg/ -t go -B 5 -A 5 | head -60Repository: cozystack/talm
Length of output: 3336
🏁 Script executed:
# Verify the exact context around GlobalArgs type assignment
grep -n "GlobalArgs" pkg/commands/root.go -A 5 -B 5Repository: cozystack/talm
Length of output: 3573
🏁 Script executed:
# Check if global.Args is imported from talos or defined locally
rg "import.*global|from.*global" pkg/commands/ -t goRepository: cozystack/talm
Length of output: 155
Use %v instead of %s for []string values in format strings.
GlobalArgs.Nodes and GlobalArgs.Endpoints are []string types. The %s format specifier is for single strings and will print Go's error-like formatting (e.g., %!s([]string=[...])) instead of readable values. Use %v to properly display slices.
🩹 Proposed fix
- fmt.Printf("- talm: file=%s, nodes=%s, endpoints=%s\n", configFile, GlobalArgs.Nodes, GlobalArgs.Endpoints)
+ fmt.Printf("- talm: file=%s, nodes=%v, endpoints=%v\n", configFile, GlobalArgs.Nodes, GlobalArgs.Endpoints)Also applies to: 157-157
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/commands/apply.go` at line 121, The fmt.Printf call that prints slice
values uses %s which misformats []string; update the format specifier to use %v
for GlobalArgs.Nodes and GlobalArgs.Endpoints in the Printf on the line with
fmt.Printf("- talm: file=%s, nodes=%s, endpoints=%s\n", configFile,
GlobalArgs.Nodes, GlobalArgs.Endpoints) and likewise update the other occurrence
(around the second reported location) so both slices are formatted with %v
instead of %s.
lexfrei
left a comment
There was a problem hiding this comment.
Requesting changes
The fix itself is correct — moving rendering inside the client callback so lookup() functions resolve real discovery data is the right approach, and the shadow/wrap bugs flagged by the bots have been properly addressed. However, the PR is missing the most important artifact: a test that proves the bug from the reports.
Required: regression test for the actual bug
The test plan only flips Offline=true → Offline=false in TestBuildApplyRenderOptions. That verifies a struct field value, not behavior. The real bug was:
applyrenders templates offline →lookup()returns empty maps- Helpers like
talm.discovered.default_link_name_by_gatewayproduce empty strings - Rendered config has
interface: ""(or missing) - Talos 1.12 rejects it with
[networking.os.device.interface], [networking.os.device.deviceSelector]: required either config section to be set
Please add a regression test that exercises this path end-to-end. Suggested shape:
- Use a fake
LookupFuncthat returns distinguishable values online vs empty offline (the codebase already swapshelmEngine.LookupFuncin existing render tests — seepkg/engine/render_test.go) - Render a minimal chart template that uses
talm.discovered.default_link_name_by_gateway(or any helper that consumeslookup) - Assert: with
Offline=true, the rendered config has an empty interface field (reproduces the bug) - Assert: with
Offline=false+ the fake lookup, the rendered config has a populated interface field (proves the fix) - Bonus: run the output through
isTalosConfigPatch+ Talos config validation to show the exact error the user reported surfaces only in the broken path
Without this test, nothing prevents someone from flipping Offline back to true in a future refactor and silently reintroducing the regression.
Bonus action items (not blocking, happy to track as separate issues)
-
Multi-node regression:
engine.Rendernow callshelpers.FailIfMultiNodesin non-insecure mode. If a user puts multiple IPs in a single modelinenodes=[ip1, ip2]or passes--nodes ip1,ip2, apply will now fail before rendering. The typical single-node-per-file flow from the README still works, and maintenance mode (-i) is unaffected, but the behavior change is worth documenting or rejecting explicitly with a clearer message. -
Misleading error message:
helpers.FailIfMultiNodes(ctx, "talm template")hardcodes the command name inpkg/engine/engine.go:210. A user runningtalm applywill see an error abouttalm template, which is confusing. Consider threading the caller command name throughengine.Options. -
Vestigial outer
errcheck: after the direct-patch refactor usesif err := withApplyClient(...); err != nil { return err }, the outerif err != nilat the end of the loop body only catches the template path. Consider restructuring so both branches return errors uniformly — the current state is fragile to future refactors. -
Dead field:
engine.Options.Insecureis set bybuildApplyRenderOptionsbut never read anywhere inpkg/engine/. Safe to remove.
Process suggestion: TDD for bug fixes
For future bug fixes, please consider writing the test first:
- Red: write a test that reproduces the bug and fails on current code. This proves the bug exists and the test can detect it.
- Green: apply the minimal fix that makes the test pass.
- Refactor: clean up without changing behavior.
This is especially valuable for bugs like this one, where the root cause is subtle (offline vs online lookup semantics) and the fix is a one-line config change that is easy to accidentally revert. A failing-first test is the only durable guarantee that the bug stays fixed.
Happy to help with the test if useful — the pkg/engine/render_test.go pattern from #116 (offline rendering tests with a swappable LookupFunc) would be a good starting point.
Thanks for the quick turnaround on this one!
Signed-off-by: Myasnikov Daniil <myasnikovdaniil2001@gmail.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/engine/render_test.go`:
- Around line 154-183: The test TestRenderOfflineSkipsLookupFunc is tautological
because it never calls Render and thus doesn't verify that Offline=true prevents
replacement of helmEngine.LookupFunc; update the test to either (A) exercise the
real render path by invoking Render with an Options{Offline:true} and a minimal
chart/client (so the code path that would set helmEngine.LookupFunc =
newLookupFunction(ctx,c) runs and you can assert the sentinel remains), or (B)
if invoking Render is impractical, rename or remove the test to avoid a
misleading assertion; reference helmEngine.LookupFunc, Options{Offline:true},
Render, newLookupFunction, and the test name TestRenderOfflineSkipsLookupFunc
when making the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8dc3c53f-1c9a-4389-8b56-c4c48e6c3a75
📒 Files selected for processing (2)
pkg/commands/apply_test.gopkg/engine/render_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/commands/apply_test.go
| // TestRenderOfflineSkipsLookupFunc verifies that Render with Offline=true does | ||
| // NOT replace the LookupFunc, and Offline=false does replace it. This is a | ||
| // unit check that the fix (Offline=false in apply) causes the real LookupFunc | ||
| // to be wired up. | ||
| func TestRenderOfflineSkipsLookupFunc(t *testing.T) { | ||
| origLookup := helmEngine.LookupFunc | ||
| defer func() { helmEngine.LookupFunc = origLookup }() | ||
|
|
||
| // Set a sentinel LookupFunc | ||
| helmEngine.LookupFunc = func(string, string, string) (map[string]interface{}, error) { | ||
| return map[string]interface{}{"sentinel": true}, nil | ||
| } | ||
|
|
||
| // Offline=true should leave the sentinel intact | ||
| opts := Options{Offline: true} | ||
| if !opts.Offline { | ||
| t.Fatal("test setup: expected Offline=true") | ||
| } | ||
|
|
||
| res, _ := helmEngine.LookupFunc("test", "", "") | ||
| if _, ok := res["sentinel"]; !ok { | ||
| t.Error("Offline=true must not replace LookupFunc") | ||
| } | ||
|
|
||
| // Verify: when Offline=false, Render() would call | ||
| // helmEngine.LookupFunc = newLookupFunction(ctx, c), replacing the sentinel. | ||
| // We can't call full Render without a chart/client, but the logic is: | ||
| // if !opts.Offline { helmEngine.LookupFunc = newLookupFunction(ctx, c) } | ||
| // This is tested implicitly by the online_lookup_populates_interface subtest. | ||
| } |
There was a problem hiding this comment.
TestRenderOfflineSkipsLookupFunc is tautological and doesn’t validate the claimed behavior.
On Line 168-Line 176, the test only checks the sentinel function you just assigned; it never invokes Render, so it can still pass even if Offline=true starts replacing LookupFunc. Please either call the real render path in this test or rename/remove it to avoid false confidence.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/engine/render_test.go` around lines 154 - 183, The test
TestRenderOfflineSkipsLookupFunc is tautological because it never calls Render
and thus doesn't verify that Offline=true prevents replacement of
helmEngine.LookupFunc; update the test to either (A) exercise the real render
path by invoking Render with an Options{Offline:true} and a minimal chart/client
(so the code path that would set helmEngine.LookupFunc =
newLookupFunction(ctx,c) runs and you can assert the sentinel remains), or (B)
if invoking Render is impractical, rename or remove the test to avoid a
misleading assertion; reference helmEngine.LookupFunc, Options{Offline:true},
Render, newLookupFunction, and the test name TestRenderOfflineSkipsLookupFunc
when making the change.
lexfrei
left a comment
There was a problem hiding this comment.
Regression test solidly reproduces the bug (offline → empty interface, online → populated interface) and exercises the real talm.discovered.default_link_name_by_gateway lookup pattern. All tests pass locally. Thanks for the follow-up!
The bonus action items from the previous review are tracked as separate issues so they can be picked up independently.
Add support for the Talos v1.12 multi-document configuration format. When TalosVersion is set to v1.12 or later (via Chart.yaml or --talos-version CLI flag), templates generate separate YAML documents instead of the deprecated monolithic machine.network and machine.registries fields. Closes #100 Engine: - Pass TalosVersion through the Helm values map into the template rendering context (available as .TalosVersion), avoiding global mutable state - Remove unused ctx and chrt parameters from applyPatchesAndRenderConfig Chart templates (cozystack + generic): - Split talos.config into shared sub-templates to eliminate duplication: talos.config.machine.common, talos.config.cluster, talos.config.network.legacy, talos.config.network.multidoc - Version dispatcher selects legacy or multi-doc format based on semverCompare - New v1.12 document types generated: HostnameConfig, ResolverConfig, LinkConfig, BondConfig, VLANConfig, RegistryMirrorConfig, Layer2VIPConfig Bug fix: - Fix nr_hugepages rendering inside with block (was producing empty string for non-zero values due to .Values being inaccessible when dot is rebound) Backward compatibility: - Legacy format is fully preserved when TalosVersion is empty or < v1.12 - All legacy regression tests pass unchanged Tests: - isTalosConfigPatch and extractExtraDocuments tests for all 7 new document types - Offline rendering tests for both charts x both roles x both formats - Version edge cases: pre-release, two-component version strings - Concurrent render test with race detector - nr_hugepages rendering test for both legacy and multi-doc paths - Regression test from #119 for offline lookup producing empty interface Code style: - Replace interface{} with any across engine package - Use maps.Copy instead of range+assign loops - Use range over int for simple counting loops - Use strings.Repeat instead of += concatenation Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
Add support for the Talos v1.12 multi-document configuration format. When TalosVersion is set to v1.12 or later (via Chart.yaml or --talos-version CLI flag), templates generate separate YAML documents instead of the deprecated monolithic machine.network and machine.registries fields. Closes #100 Engine: - Pass TalosVersion through the Helm values map into the template rendering context (available as .TalosVersion), avoiding global mutable state - Remove unused ctx and chrt parameters from applyPatchesAndRenderConfig Chart templates (cozystack + generic): - Split talos.config into shared sub-templates to eliminate duplication: talos.config.machine.common, talos.config.cluster, talos.config.network.legacy, talos.config.network.multidoc - Version dispatcher selects legacy or multi-doc format based on semverCompare - New v1.12 document types generated: HostnameConfig, ResolverConfig, LinkConfig, BondConfig, VLANConfig, RegistryMirrorConfig, Layer2VIPConfig Bug fix: - Fix nr_hugepages rendering inside with block (was producing empty string for non-zero values due to .Values being inaccessible when dot is rebound) Backward compatibility: - Legacy format is fully preserved when TalosVersion is empty or < v1.12 - All legacy regression tests pass unchanged Tests: - isTalosConfigPatch and extractExtraDocuments tests for all 7 new document types - Offline rendering tests for both charts x both roles x both formats - Version edge cases: pre-release, two-component version strings - Concurrent render test with race detector - nr_hugepages rendering test for both legacy and multi-doc paths - Regression test from #119 for offline lookup producing empty interface Code style: - Replace interface{} with any across engine package - Use maps.Copy instead of range+assign loops - Use range over int for simple counting loops - Use strings.Repeat instead of += concatenation Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
* feat(config): migrate to Talos v1.12 multi-document config format Add support for the Talos v1.12 multi-document configuration format. When TalosVersion is set to v1.12 or later (via Chart.yaml or --talos-version CLI flag), templates generate separate YAML documents instead of the deprecated monolithic machine.network and machine.registries fields. Closes #100 Engine: - Pass TalosVersion through the Helm values map into the template rendering context (available as .TalosVersion), avoiding global mutable state - Remove unused ctx and chrt parameters from applyPatchesAndRenderConfig Chart templates (cozystack + generic): - Split talos.config into shared sub-templates to eliminate duplication: talos.config.machine.common, talos.config.cluster, talos.config.network.legacy, talos.config.network.multidoc - Version dispatcher selects legacy or multi-doc format based on semverCompare - New v1.12 document types generated: HostnameConfig, ResolverConfig, LinkConfig, BondConfig, VLANConfig, RegistryMirrorConfig, Layer2VIPConfig Bug fix: - Fix nr_hugepages rendering inside with block (was producing empty string for non-zero values due to .Values being inaccessible when dot is rebound) Backward compatibility: - Legacy format is fully preserved when TalosVersion is empty or < v1.12 - All legacy regression tests pass unchanged Tests: - isTalosConfigPatch and extractExtraDocuments tests for all 7 new document types - Offline rendering tests for both charts x both roles x both formats - Version edge cases: pre-release, two-component version strings - Concurrent render test with race detector - nr_hugepages rendering test for both legacy and multi-doc paths - Regression test from #119 for offline lookup producing empty interface Code style: - Replace interface{} with any across engine package - Use maps.Copy instead of range+assign loops - Use range over int for simple counting loops - Use strings.Repeat instead of += concatenation Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * fix(engine): validate TalosVersion before template rendering Call ParseContractFromVersion early in Render() so malformed values (e.g. "latest", "foobar") surface a user-friendly "invalid talos-version" error instead of the opaque "semverCompare: invalid semantic version" that otherwise escapes from deep inside the Helm engine template evaluation. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * test(engine): cover bond/vlan topologies, invalid version, Layer2VIPConfig - TestRenderInvalidTalosVersion: exercises the new early validation. - TestMultiDocCozystack_BondTopology / TestMultiDocGeneric_BondTopology: assert BondConfig emission with mock links/routes/addresses lookups. - TestMultiDocCozystack_VlanOnBondTopology / TestMultiDocGeneric_VlanOnBondTopology: assert BondConfig + VLANConfig combined emission when the default route goes through a VLAN. - TestMultiDocCozystack_ControlPlane: add positive assert for Layer2VIPConfig since cozystack defaults set floatingIP. - TestLegacy/MultiDocCozystack_NrHugepages: restore the package-level helmEngine.LookupFunc via t.Cleanup for test isolation. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * docs: refine multi-doc README note and machine.common helper comment - README: clarify that not all multi-doc types are always present — HostnameConfig/ResolverConfig and one network interface document are always emitted, Layer2VIPConfig depends on floatingIP on controlplane, and RegistryMirrorConfig is cozystack-only. - charts/cozystack _helpers.tpl: the talos.config.machine.common helper also renders nodeLabels for controlplane nodes — update the comment. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * fix(helpers): attach Layer2VIP to VLAN interface in multi-doc path In the VLAN branch of talos.config.network.multidoc, $interfaceName is rewritten to the parent link, while addresses/routes are emitted on the VLAN link ($defaultLinkName). Using $interfaceName for Layer2VIPConfig.link moved the VIP off the configured VLAN. Select $defaultLinkName when the default link is a VLAN so the VIP stays attached to the VLAN interface, matching the legacy path behavior. Address review feedback from coderabbitai on charts/cozystack/templates/_helpers.tpl:229. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * fix(helpers): attach Layer2VIP to VLAN interface in generic multi-doc path Mirror of the cozystack fix: when the default link is a VLAN, $interfaceName is rewritten to the parent link while the VLAN addresses/routes are emitted on $defaultLinkName. Use $defaultLinkName for Layer2VIPConfig.link in that branch so the VIP stays on the VLAN. Address review feedback from coderabbitai on charts/generic/templates/_helpers.tpl:152. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> --------- Signed-off-by: Aleksei Sviridkin <f@lex.la>
Summary
talm applywith modelinetemplates=was rendering templates offline (no Talos client), so alllookup()calls returned empty values — producing configs with emptyinterfacenames[networking.os.device.interface], [networking.os.device.deviceSelector]: required either config section to be setengine.RenderwithOffline=false, solookup()functions resolve real discovery data from the target nodetalm applya true single-step "render + apply" command, matching the pattern already used bytalm templateWhat changed in
apply.goBefore (broken flow):
After (fixed flow):
Extracted inline
withClientclosure into a namedwithApplyClienthelper. Both the template rendering path and the direct patch path now create the client via this helper. For the template path,engine.Renderis called inside the client callback with the real client.Test plan
go build ./...passesgo test ./pkg/commands/ -run TestBuildApply -vpasses (updated assertion:Offline=false)go test ./pkg/... -v— all tests passtalm apply -f nodes/master1.yaml -irenders templates online with real lookups and applies to Talos v1.12 node🤖 Generated with Claude Code
Summary by CodeRabbit